🛠️ All DevTools
Showing 581–600 of 4302 tools
Last Updated
April 24, 2026 at 08:00 AM
[Other] Show HN: Modulus – Cross-repository knowledge orchestration for coding agents Hello HN, we're Jeet and Husain from Modulus (<a href="https://modulus.so" rel="nofollow">https://modulus.so</a>) - a desktop app that lets you run multiple coding agents with shared project memory. We built it to solve two problems we kept running into:<p>- Cross-repo context is broken. When working across multiple repositories, agents don't understand the dependencies between them. Even if we open two repos in separate Cursor windows, we still have to manually explain the backend API schema while making changes in the frontend repo.<p>- Agents lose context. Switching between coding agents often means losing context and repeating the same instructions.<p>Modulus shares memory across agents and repositories so they can understand your entire system.<p>It's an alternative to tools like Conductor for orchestrating AI coding agents to build products, but we focused specifically on multi-repo workflows (e.g., backend repo + client repo + shared library repo + AI agents repo). We built our own Memory and Context Engine from the ground up specifically for coding agents.<p>Why build another agent orchestration tool? It came from our own problem. While working on our last startup, Husain and I worked across two different repositories, which meant manually pasting API schemas between Cursor windows – telling the frontend agent what the backend API looked like again and again. So we built a small context engine to share knowledge across repos and hooked it up to Cursor via MCP. This later became Modulus.<p>Soon, Modulus will allow teams to share knowledge with others to improve their workflows with AI coding agents - enabling team collaboration in the era of AI coding.
Our API will allow developers to switch between coding agents or IDEs without losing any context.<p>If you want to see a quick demo before trying it out, here is our launch post - <a href="https://x.com/subhajitsh/status/2024202076293841208" rel="nofollow">https://x.com/subhajitsh/status/2024202076293841208</a><p>We'd greatly appreciate any feedback you have and hope you get the chance to try out Modulus.
AEP (API Design Standard and Tooling Ecosystem)
Hacker News (score: 21)[API/SDK] AEP (API Design Standard and Tooling Ecosystem)
Launch HN: RunAnywhere (YC W26) – Faster AI Inference on Apple Silicon
Hacker News (score: 134)[Other] Launch HN: RunAnywhere (YC W26) – Faster AI Inference on Apple Silicon Hi HN, we're Sanchit and Shubham (YC W26). We built a fast inference engine for Apple Silicon. LLMs, speech-to-text, text-to-speech – MetalRT beats llama.cpp, Apple's MLX, Ollama, and sherpa-onnx on every modality we tested. Custom Metal shaders, no framework overhead.<p>Also, we've open-sourced RCLI, the fastest end-to-end voice AI pipeline on Apple Silicon. Mic to spoken response, entirely on-device. No cloud, no API keys.<p>To get started:<p><pre><code>brew tap RunanywhereAI/rcli https://github.com/RunanywhereAI/RCLI.git
brew install rcli
rcli setup   # downloads ~1 GB of models
rcli         # interactive mode with push-to-talk
</code></pre> Or:<p><pre><code>curl -fsSL https://raw.githubusercontent.com/RunanywhereAI/RCLI/main/install.sh | bash</code></pre> The numbers (M4 Max, 64 GB, reproducible via `rcli bench`):<p>LLM decode – 1.67x faster than llama.cpp, 1.19x faster than Apple MLX (same model files):
- Qwen3-0.6B: 658 tok/s (vs mlx-lm 552, llama.cpp 295)
- Qwen3-4B: 186 tok/s (vs mlx-lm 170, llama.cpp 87)
- LFM2.5-1.2B: 570 tok/s (vs mlx-lm 509, llama.cpp 372)
- Time-to-first-token: 6.6 ms<p>STT – 70 seconds of audio transcribed in *101 ms*. That's 714x real-time. 4.6x faster than mlx-whisper.<p>TTS – 178 ms synthesis. 2.8x faster than mlx-audio and sherpa-onnx.<p>We built this because demoing on-device AI is easy but shipping it is brutal. Voice is the hardest test: you're chaining STT, LLM, and TTS sequentially, and if any stage is slow, the user feels it. Most teams fall back to cloud APIs not because local models are bad, but because local inference infrastructure is.<p>The thing that's hard to solve is latency compounding. In a voice pipeline, you're stacking three models in sequence. If each adds 200 ms, you're at 600 ms before the user hears a word, and that feels broken. You can't optimize one stage and call it done.
Every stage needs to be fast, on one device, with no network round-trip to hide behind.<p>We went straight to Metal. Custom GPU compute shaders, all memory pre-allocated at init (zero allocations during inference), and one unified engine for all three modalities instead of stitching separate runtimes together.<p>MetalRT is the first engine to handle all three modalities natively on Apple Silicon. Full methodology:<p>LLM benchmarks: <a href="https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-engine-apple-silicon">https://www.runanywhere.ai/blog/metalrt-fastest-llm-decode-e...</a><p>Speech benchmarks: <a href="https://www.runanywhere.ai/blog/metalrt-speech-fastest-stt-tts-apple-silicon">https://www.runanywhere.ai/blog/metalrt-speech-fastest-stt-t...</a><p>How: Most inference engines add layers between you and the GPU: graph schedulers, runtime dispatchers, memory managers. MetalRT skips all of it. Custom Metal compute shaders for quantized matmul, attention, and activation - compiled ahead of time, dispatched directly.<p>Voice Pipeline optimizations details: <a href="https://www.runanywhere.ai/blog/fastvoice-on-device-voice-ai-pipeline-apple-silicon">https://www.runanywhere.ai/blog/fastvoice-on-device-voice-ai...</a> RAG optimizations: <a href="https://www.runanywhere.ai/blog/fastvoice-rag-on-device-retrieval-augmented-voice-ai">https://www.runanywhere.ai/blog/fastvoice-rag-on-device-retr...</a><p>RCLI is the open-source voice pipeline (MIT) built on MetalRT: three concurrent threads with lock-free ring buffers, double-buffered TTS, 38 macOS actions by voice, local RAG (~4 ms over 5K+ chunks), 20 hot-swappable models, and a full-screen TUI with per-op latency readouts. 
Falls back to llama.cpp when MetalRT isn't installed.<p>Source: <a href="https://github.com/RunanywhereAI/RCLI" rel="nofollow">https://github.com/RunanywhereAI/RCLI</a> (MIT)<p>Demo: <a href="https://www.youtube.com/watch?v=eTYwkgNoaKg" rel="nofollow">https://www.youtube.com/watch?v=eTYwkgNoaKg</a><p>What would you build if on-device AI were genuinely as fast as cloud?
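The latency-compounding point can be made concrete with a toy sequential pipeline. The stage names and stubs below are hypothetical placeholders, not RunAnywhere's API; the sketch only shows why per-stage latencies add up serially:

```python
import time

def timed(stage_name, fn, *args):
    """Run one pipeline stage and report its wall-clock latency."""
    start = time.perf_counter()
    out = fn(*args)
    ms = (time.perf_counter() - start) * 1000
    print(f"{stage_name}: {ms:.1f} ms")
    return out, ms

# Hypothetical stand-ins for the three sequential stages.
def stt(audio):
    return "turn on the lights"          # speech -> text

def llm(text):
    return "Turning on the lights."      # text -> reply

def tts(text):
    return b"\x00" * 16000               # reply -> audio samples

def voice_pipeline(audio):
    """Chain STT -> LLM -> TTS; total delay is the sum of all stages."""
    total = 0.0
    text, ms = timed("STT", stt, audio)
    total += ms
    reply, ms = timed("LLM", llm, text)
    total += ms
    pcm, ms = timed("TTS", tts, reply)
    total += ms
    print(f"time-to-first-sound: {total:.1f} ms")
    return pcm
```

With real models, each stage contributes real latency, so a 200 ms budget per stage means 600 ms before the user hears anything; that is the compounding the post describes.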
Show HN: Agentic Data Analysis with Claude Code
Show HN (score: 5)[Other] Show HN: Agentic Data Analysis with Claude Code Hey HN, as a former data analyst, I've been tooling around trying to get agents to do my old job. The result is this system that gets you maybe 80% of the way there. I think this is a good data point for what the current frontier models are capable of and where they are still lacking (in this case – hypothesis generation and general data intuition).<p>Some initial learnings:
- Generating web app-based reports goes much better if there are explicit templates/pre-defined components for the model to use.
- Claude can "heal" broken charts if you give it access to chart images and run a separate QA loop.<p>Would love either feedback from the community or to hear from others who have tried similar things!
I built a programming language using Claude Code
Hacker News (score: 72)[Other] I built a programming language using Claude Code
Show HN: A modern React onboarding tour library
Show HN (score: 8)[Other] Show HN: A modern React onboarding tour library react-tourlight is a modern React tour library: zero dependencies, WCAG 2.1 AA accessible, under 5 kB gzipped. And it works with React 19.
sepinf-inc/IPED
GitHub Trending[Other] IPED Digital Forensic Tool. Open-source software for processing and analyzing digital evidence, often seized at crime scenes by law enforcement or examined in corporate investigations by private examiners.
Show HN: Ash, an Agent Sandbox for Mac
Show HN (score: 7)[Other] Show HN: Ash, an Agent Sandbox for Mac Ash is a macOS sandbox that restricts AI coding agents. It limits access to files, networks, processes, IO devices, and environment variables. You can use Ash with any CLI coding agent by wrapping it in a single command: `ash run -- <agent>`. I typically use it with Claude to stay safe while avoiding repetitive prompts: `ash run -- claude --dangerously-skip-permissions`.<p>Ash restricts resources via the Endpoint Security and Network Extension frameworks. These frameworks are significantly more powerful than the sandbox-exec tool.<p>Each session is driven by a policy file. Any out-of-policy action is denied by default. You can audit denials in the GUI app, which lets you view out-of-policy actions and retroactively add them to your policy file.<p>Ash also comes with tools for building policies. You can use an "observation session" to watch the typical behavior of a coding agent and capture that behavior in a policy file for future sandbox sessions. Linting, formatting, and rule merging are all built into the Ash CLI to keep your policy files concise and maintainable.<p>Download Ash at <a href="https://ashell.dev" rel="nofollow">https://ashell.dev</a>
Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs
Hacker News (score: 324)[Other] Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs I found that duplicating a specific block of 7 middle layers in Qwen2-72B, without modifying any weights, improved performance across all Open LLM Leaderboard benchmarks and took #1. As of 2026, the top 4 models on that leaderboard are still descendants.<p>The weird finding: single-layer duplication does nothing. Too few layers, nothing. Too many, it gets worse. Only circuit-sized blocks of ~7 layers work. This suggests pretraining carves out discrete functional circuits in the layer stack that only work when preserved whole.<p>The whole thing was developed on 2x RTX 4090s in my basement. I'm now running current models (GLM-4.7, Qwen3.5, MiniMax M2.5) on a dual GH200 rig (see my other post). Code and new models coming soon.<p>Happy to answer questions.
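The duplication itself is mechanically simple; it is sometimes called depth upscaling or a passthrough merge. A minimal sketch of the operation, with illustrative layer indices (the post does not specify which 7 of Qwen2-72B's 80 layers were used):

```python
def duplicate_layer_block(layers, start, block_len):
    """Return a new layer stack with layers[start:start+block_len]
    repeated in place; no weights are modified, only the sequence
    of layers grows (depth upscaling / passthrough merge)."""
    block = layers[start:start + block_len]
    return layers[:start + block_len] + block + layers[start + block_len:]

# 80 decoder layers, as in Qwen2-72B; duplicate a 7-layer middle block.
base = [f"layer_{i}" for i in range(80)]
grown = duplicate_layer_block(base, start=38, block_len=7)

assert len(grown) == 87                 # 80 + 7 duplicated layers
assert grown[38:45] == grown[45:52]     # the block appears twice, unmodified
```

In practice this is done on the model's weight tensors (e.g., via a merge config that lists layer ranges), but the list manipulation above is the whole idea: the circuit-sized block runs twice during the forward pass.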
Show HN: Smux – Terminal Multiplexer built for AI agents
Show HN (score: 5)[CLI Tool] Show HN: Smux – Terminal Multiplexer built for AI agents
Show HN: DD Photos – open-source photo album site generator (Go and SvelteKit)
Hacker News (score: 51)[Other] Show HN: DD Photos – open-source photo album site generator (Go and SvelteKit) I was frustrated with photo sharing sites. Apple's iCloud shared albums take 20+ seconds to load, and everything else comes with ads, cumbersome UIs, or social media distractions. I just want to share photos with friends and family: fast, mobile-friendly, distraction-free.<p>So I built DD Photos. You export photos from whatever you already use (Lightroom, Apple Photos, etc.) into folders, run `photogen` (a Go CLI) to resize them to WebP and generate JSON indexes, then deploy the SvelteKit static site anywhere that serves files. Apache, S3, whatever. No server-side code, no database.<p>Built over several weeks with heavy use of Claude Code, which I found genuinely useful for this kind of full-stack project spanning Go, SvelteKit/TypeScript, Apache config, Docker, and Playwright tests. Happy to discuss that experience too.<p>Live example: <a href="https://photos.donohoe.info" rel="nofollow">https://photos.donohoe.info</a> Repo: <a href="https://github.com/dougdonohoe/ddphotos" rel="nofollow">https://github.com/dougdonohoe/ddphotos</a>
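The "no server-side code, no database" design hinges on the index-generation step: a static JSON file per album that the frontend fetches. As a rough illustration (not photogen itself, and the schema here is hypothetical), a stdlib-only sketch that walks an album folder and emits such an index; a real tool would also transcode each image to WebP with an image library:

```python
import json
import pathlib

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}

def build_index(album_dir):
    """Emit a JSON index of an album folder: one entry per image file.
    The frontend can fetch this instead of querying a database."""
    root = pathlib.Path(album_dir)
    entries = [
        {"file": p.name, "bytes": p.stat().st_size}
        for p in sorted(root.iterdir())
        if p.suffix.lower() in IMAGE_SUFFIXES
    ]
    return json.dumps({"album": root.name, "photos": entries}, indent=2)
```

Deployment then reduces to copying the resized images plus these JSON files to any static host, which is why Apache or S3 suffices.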
Show HN: Local-first firmware analyzer using WebAssembly
Show HN (score: 7)[Other] Show HN: Local-first firmware analyzer using WebAssembly Hi HN,<p>I just wanted to share what I have been working on for the past few months: a firmware analyzer for embedded Linux systems that helps uncover security issues, running entirely in the browser.<p>This is a very early alpha. It is going to be rough around the edges, but I think it provides quite a lot of value already.<p>So please go ahead and drop in a firmware image (only .tar rootfs archives for now) and try to break it :)
promptfoo/promptfoo
GitHub Trending[Testing] Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
OpenTelemetry for Rust Developers
Hacker News (score: 11)[Other] OpenTelemetry for Rust Developers
Removing recursion via explicit callstack simulation
Hacker News (score: 11)[Other] Removing recursion via explicit callstack simulation
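The technique this title names, replacing implicit recursion with an explicitly managed stack, can be shown in a few lines. A minimal sketch using tree summation as the example (the linked article's own code is not reproduced here):

```python
def sum_tree_recursive(node):
    """Baseline: recursion uses the language's call stack."""
    if node is None:
        return 0
    return (node["val"]
            + sum_tree_recursive(node["left"])
            + sum_tree_recursive(node["right"]))

def sum_tree_iterative(root):
    """Same traversal, but pending nodes live on an explicit stack,
    so depth is bounded by heap memory rather than call-stack size."""
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        total += node["val"]
        stack.append(node["left"])
        stack.append(node["right"])
    return total
```

The general recipe is the same for any recursive function: push the state a call frame would have held, loop until the stack is empty, and accumulate results as frames are popped.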
Claude Code, Claude Cowork and Codex #5
Hacker News (score: 31)[Other] Claude Code, Claude Cowork and Codex #5
Show HN: I Was Here – Draw on street view, others can find your drawings
Hacker News (score: 36)[Other] Show HN: I Was Here – Draw on street view, others can find your drawings Hey HN, I made a site where you can draw on street-level panoramas. Your drawings persist, and other people can see them in real time.<p>Strokes get projected onto the 3D panorama so they wrap around buildings and follow the geometry, not just a flat overlay. It uses WebGL2 for rendering and Mapillary for the street imagery.<p>The idea is for it to become a global canvas: anyone can leave a mark anywhere, and others stumble onto it.
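One common way to anchor strokes to panorama geometry is to convert between panorama image coordinates and unit direction vectors on a sphere; the site's actual projection is not documented in the post, but a sketch of the standard equirectangular mapping shows the idea:

```python
import math

def pano_uv_to_dir(u, v):
    """Map equirectangular panorama coords (u, v in [0, 1]) to a unit
    3D direction. Strokes stored as directions stay attached to the
    scene as the viewer rotates, rather than floating as a 2D overlay."""
    lon = (u - 0.5) * 2.0 * math.pi   # yaw: -pi..pi across the image width
    lat = (0.5 - v) * math.pi         # pitch: +pi/2 at top, -pi/2 at bottom
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))

def dir_to_pano_uv(d):
    """Inverse mapping: unit direction back to panorama (u, v)."""
    x, y, z = d
    lon = math.atan2(x, z)
    lat = math.asin(max(-1.0, min(1.0, y)))
    return lon / (2.0 * math.pi) + 0.5, 0.5 - lat / math.pi
```

In a WebGL2 renderer the same math would run in a shader; intersecting these directions with reconstructed depth is what lets strokes wrap around buildings.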
The Cost of 'Lightweight' Frameworks: From Tauri to Native Rust
Hacker News (score: 14)[Other] The Cost of 'Lightweight' Frameworks: From Tauri to Native Rust
Oracle is building yesterday's data centers with tomorrow's debt
Hacker News (score: 159)[Other] Oracle is building yesterday's data centers with tomorrow's debt